Co-Occurrence Cluster Features for Lexical Substitutions in Context

Author

Christian Biemann

Abstract

This paper examines the influence of features based on clusters of co-occurrences for supervised Word Sense Disambiguation (WSD) and Lexical Substitution. Co-occurrence cluster features are derived by clustering the local neighborhood of a target word in a co-occurrence graph built from a corpus, in a completely unsupervised fashion. Clusters can be assigned in context and are used as features in a supervised WSD system. Experiments that add these features to a strong baseline system are conducted on two datasets, showing improvements. Co-occurrence cluster features are a simple way to mimic Topic Signatures (Martínez et al., 2008) without needing to construct resources manually. Further, a system is described that produces lexical substitutions in context with very high precision.
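The pipeline outlined in the abstract (build a co-occurrence graph, cluster the neighborhood of a target word, assign a cluster in context) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not name the clustering algorithm, so connected components of the neighborhood subgraph are used here as a simple stand-in. The function names, the `min_count` threshold, the sentence-level co-occurrence window, and the toy corpus are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(sentences):
    """Count sentence-level co-occurrences as an undirected weighted graph."""
    graph = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        for a, b in combinations(sorted(set(sentence)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def cluster_neighborhood(graph, target, min_count=1):
    """Cluster the neighbors of `target`, with the target itself removed.

    Stand-in clustering (an assumption): connected components of the
    subgraph induced by the target's neighbors.
    """
    neighbors = {w for w, c in graph[target].items() if c >= min_count}
    clusters, seen = [], set()
    for start in sorted(neighbors):  # sorted for a deterministic cluster order
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(
                n for n, c in graph[node].items()
                if n in neighbors and c >= min_count and n not in component
            )
        seen |= component
        clusters.append(component)
    return clusters


def assign_cluster(clusters, context):
    """Return the index of the cluster sharing the most words with `context`.

    Returns None when no cluster overlaps the context at all.
    """
    tokens = set(context)
    overlaps = [len(cluster & tokens) for cluster in clusters]
    best = max(range(len(clusters)), key=overlaps.__getitem__, default=None)
    if best is None or overlaps[best] == 0:
        return None
    return best


# Toy corpus, invented purely for illustration.
corpus = [
    ["bank", "money", "loan"],
    ["bank", "loan", "interest"],
    ["bank", "river", "water"],
    ["bank", "water", "shore"],
]
graph = cooccurrence_graph(corpus)
clusters = cluster_neighborhood(graph, "bank")
feature = assign_cluster(clusters, ["the", "bank", "raised", "interest", "rates"])
```

In a supervised WSD setup, the returned cluster index (or None) would be added as one more feature next to the baseline features of each training and test instance.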

Similar resources

Creating a system for lexical substitutions from scratch using crowdsourcing

This article describes the creation and application of the Turk Bootstrap Word Sense Inventory for 397 frequent nouns, which is a publicly available resource for lexical substitution. This resource was acquired using Amazon Mechanical Turk. In a bootstrapping process with massive collaborative input, substitutions for target words in context are elicited and clustered by sense; then, more conte...

Choosing the Word Most Typical in Context Using a Lexical Co-Occurrence Network

This paper presents a partial solution to a component of the problem of lexical choice: choosing the synonym most typical, or expected, in context. We apply a new statistical approach to representing the context of a word through lexical co-occurrence networks. The implementation was trained and evaluated on a large corpus, and results show that the inclusion of second-order co-occurrence relat...

Lexical Semantics and Selection of TAM in Bantu Languages: A Case of Semantic Classification of Kiswahili Verbs

The existing literature on Bantu verbal semantics demonstrated that inherent semantic content of verbs pairs directly with the selection of tense, aspect and modality formatives in Bantu languages like Chasu, Lucazi, Lusamia, and Shiyeyi. Thus, the gist of this paper is the articulation of semantic classification of verbs in Kiswahili based on the selection of TAM types. This is because the sem...

Lexical Co-occurrence, Statistical Significance, and Word Association

Lexical co-occurrence is an important cue for detecting word associations. We present a theoretical framework for discovering statistically significant lexical co-occurrences from a given corpus. In contrast with the prevalent practice of giving weightage to unigram frequencies, we focus only on the documents containing both the terms (of a candidate bigram). We detect biases in span distributi...

Automated lexical adaptation and speaker clustering based on pronunciation habits for non-native speech recognition

This paper describes a method to improve speech recognition for non-native speech in a spoken dialogue system. Based on very general rules about possible vocalic substitutions, the frequency of occurrence of each substitution in different phonetic contexts is estimated on a small set of recordings. The most frequently observed substitutions are applied to the lexicon of the recognizer. Speakers...

Publication date: 2010
